Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Ignore Rate Limit for Whitelisted Clients #27

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

jay-parikh
Copy link

Ignore rate limit for whitelisted clients.

@italorossi
Copy link
Member

Hey @jay-parikh, thank you very much for your contribution, I really appreciate your effort but I'm afraid this change could be a little bit out of the scope of the library. The main purpose is to rate limit a client, which can be an IP address or a shared resource or whatever you need to control the access of. Adding functionality to not control something would be similar to not calling the rate_limit itself. I suppose that adding some logic in your code to decide whether you need or not to rate_limit would be easier than injecting this responsibility in our library.

Is there any other good reason to implement this feature that I'm missing here? Could you help me to understand more?

Thanks!

@jay-parikh
Copy link
Author

Thanks @italorossi for your prompt response. This rate limiting library really comes handy.

Whitelisted client feature comes handy when you have setup Trusted Network which is put behind some Firewall and is sharing same public ip address to communicate with your REST Api endpoint. In this case, we could whitelist client ip of Trusted Network. Which bypass rate limit of Api endpoints.

Or let us say we have an public Api endpoint and we want to rate limit each and every request except our own network (for development team)

Here, we have an option to keep whitelisted clients empty if we want to apply rate limit to each and every incoming request.

Generally if we have look at any popular rate limiting library of any language, whitelisting is a key feature. For instance, https://github.com/fastify/fastify-rate-limit is also having whitelist feature.

@victor-torres
Copy link
Contributor

Thank you for your contribution, @jay-parikh 🎉
It's a simple and straightforward implementation.

Although I have some considerations:

  • The client list is kept in memory, not sure if that scales depending on the number of clients you want to store.
  • Also, how are you acquiring that list? Are you querying a relational database? Isn't it slow compared to Redis?
  • Maybe this list could be stored on Redis and the library could have methods to add/remove clients to/from it.

But anyway:

  • If you have access to a list of clients that shouldn't be bypassed, why don't you just skip rate limit for them?

If you're retrieving max requests for a given client from a database (e.g. Redis or Postgres), yet another interesting approach could be using math.inf as the max_requests argument. This way you could keep track of the client's requests and have a usage history if you'd like to revoke this privilege later. But I don't know if this could be very useful...


Actually, since you've said that you're trying to unconditionally allow requests coming from your own network, you could create a function that returns a boolean value checking if the client's IP address is part of your network (using a netmask, for example). That could be more efficient than having to maintain a list of allowed IP addresses and it's quite feasible depending on your network topology. Based on this function's return you could decide whether to call the rate limit function or not.

I personally agree with @italorossi. I believe this decision is very particular to your business logic and use case. I wouldn't like to force users of this library to adhere to a specific pattern on this matter, especially when it seems to be very simple to come up with your own logic here.

@victor-torres
Copy link
Contributor

victor-torres commented Jul 16, 2020

@jay-parikh @italorossi

What do you guys think about skipping the rate limit check if max_requests is None?

  • it's a standard way to bypass the limit check
  • it still leaves the decision process on the user's hands

Usage example:

from redis_rate_limit import RateLimit, TooManyRequests

ALLOWED_CLIENTS = ('127.0.0.1',)
max_requests = None if client in ALLOWED_CLIENTS else 10

try:
    with RateLimit(resource='users_list', client='192.168.0.10', max_requests=max_requests):
        return '200 OK'
except TooManyRequests:
    return '429 Too Many Requests'

Copy link
Contributor

@victor-torres victor-torres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm sorry, @jay-parikh, I'm starting to see value in your implementation here. It could indeed be very useful when you have a small list of allowed clients (localhost, internal IPs etc.) as that could simplify your business logic without adding much complexity to the library itself.

Left some comments.

@italorossi, would you like to re-evaluate it?

@@ -43,21 +43,23 @@ class RateLimit(object):
This class offers an abstraction of a Rate Limit algorithm implemented on
top of Redis >= 2.6.0.
"""
def __init__(self, resource, client, max_requests, expire=None, redis_pool=REDIS_POOL):
def __init__(self, resource, client, max_requests, whitelisted_clients=[], expire=None, redis_pool=REDIS_POOL):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should think about a better name here since there's a movement to avoid some terms in software development. Maybe ignore_clients or bypass_clients.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to ignored_clients=None

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -88,7 +90,6 @@ def get_usage(self):
"""
Returns actual resource usage by client. Note that it could be greater
than the maximum number of requests set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've removed a bunch of blank lines from docstrings. Could you revert it, please?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes reverted.

current_usage = self._redis.evalsha(
INCREMENT_SCRIPT_HASH, 1, self._rate_limit_key, self._expire, increment_by)
if self.client not in self.whitelisted_clients:
current_usage = self._redis.evalsha(
INCREMENT_SCRIPT_HASH, 1, self._rate_limit_key, self._expire, increment_by)
else:
current_usage = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't touch this code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my last comment and add this check at the top like if self.ignored_clients and self.client in self.ignored_clients: return 0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes done as requested.

:param increment_by: The count to increment the rate limiter by.
This is by default 1, but higher values are provided for more flexible
rate-limiting schemes.

:return: integer: current usage
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the first line here could be:

if self.client in self.bypass_clients:
    return 0

The same could be done to other methods since they don't even need to query Redis if the client is supposed to be bypassed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

expire=self.expire,
redis_pool=self.redis_pool,
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this change.

tests/rate_limit_test.py Outdated Show resolved Hide resolved
@victor-torres
Copy link
Contributor

If we're doing this, we need to add documentation specifying that the use case intended is with a very short list of allowed IPs. It's not suited for large sets of clients.

@italorossi
Copy link
Member

Sorry for the delay!

Yes, the implementation is simple enough to be considered. The Fastify example is a little bit different since it's a plugin to the web framework but anyway...

We can go ahead and implement that with the following changes:

  1. whitelisted_clients=[] - two problems here, the name and we shouldn't be using an empty list as default arguments :) suggestion: ignored_clients=None
  2. add the ignored_clients check at increment_usage at the top of this function, something like if self.ignored_clients and self.client in self.ignored_clients: return 0
  3. please revert all the empty lines removals

About the max_requests I'd prefer to not change it and leave it as a required integer.

@@ -43,21 +43,23 @@ class RateLimit(object):
This class offers an abstraction of a Rate Limit algorithm implemented on
top of Redis >= 2.6.0.
"""
def __init__(self, resource, client, max_requests, expire=None, redis_pool=REDIS_POOL):
def __init__(self, resource, client, max_requests, whitelisted_clients=[], expire=None, redis_pool=REDIS_POOL):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to ignored_clients=None

current_usage = self._redis.evalsha(
INCREMENT_SCRIPT_HASH, 1, self._rate_limit_key, self._expire, increment_by)
if self.client not in self.whitelisted_clients:
current_usage = self._redis.evalsha(
INCREMENT_SCRIPT_HASH, 1, self._rate_limit_key, self._expire, increment_by)
else:
current_usage = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my last comment and add this check at the top like if self.ignored_clients and self.client in self.ignored_clients: return 0

@jay-parikh
Copy link
Author

Sorry for the delay!

Yes, the implementation is simple enough to be considered. The Fastify example is a little bit different since it's a plugin to the web framework but anyway...

We can go ahead and implement that with the following changes:

  1. whitelisted_clients=[] - two problems here, the name and we shouldn't be using an empty list as default arguments :) suggestion: ignored_clients=None
  2. add the ignored_clients check at increment_usage at the top of this function, something like if self.ignored_clients and self.client in self.ignored_clients: return 0
  3. please revert all the empty lines removals

About the max_requests I'd prefer to not change it and leave it as a required integer.

Thanks for your valuable feedback. I have done all suggested changes. Please review and let me know.

Copy link
Contributor

@victor-torres victor-torres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still minor changes to be reverted and I've asked for an opinion from @italorossi regarding the right place to put the check.

@@ -88,7 +92,6 @@ def get_usage(self):
"""
Returns actual resource usage by client. Note that it could be greater
than the maximum number of requests set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blank line is still removed, could you revert it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you pushed those updates?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry forgot to push changes. Updated.

expire=self.expire,
redis_pool=self.redis_pool,
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, revert this change leaving a blank line at the end of the file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -88,7 +92,6 @@ def get_usage(self):
"""
Returns actual resource usage by client. Note that it could be greater
than the maximum number of requests set.

:return: integer: current usage
"""
return int(self._redis.get(self._rate_limit_key) or 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the check should be done here.

Suggested change
return int(self._redis.get(self._rate_limit_key) or 0)
if self.ignored_clients and self.client in self.ignored_clients:
return 0
return int(self._redis.get(self._rate_limit_key) or 0)

There are two reasons for it:

  • adding a client to the list has always an instant bypass effect
  • removing a client from the list automatically enforces the limits again

The reason is to avoid edge cases like this:

  • client reaches max requests limit
  • client is added to the ignored clients list
  • client will still need to wait for expire time before being allowed to make requests again

Or this other one:

  • client is added to ignored list by mistake and overflows max number of requests
  • client is removed from the list
  • there's no back-off

What do you think, @italorossi?

@jay-parikh
Copy link
Author

@victor-torres If every thing is all right here then can we merge this pull request?

@victor-torres
Copy link
Contributor

@victor-torres If every thing is all right here then can we merge this pull request?

I'm just waiting for @italorossi's input regarding moving the verification logic to another method.
See #27 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants